
    Classification of Large-Scale High-Resolution SAR Images with Deep Transfer Learning

    The classification of large-scale high-resolution SAR land cover images acquired by satellites is a challenging task, facing several difficulties: semantic annotation requires expertise, data characteristics change with varying imaging parameters and regional differences in the target area, and the complex scattering mechanisms differ from optical imaging. Given a large-scale SAR land cover dataset collected from TerraSAR-X images, with a hierarchical three-level annotation of 150 categories and comprising more than 100,000 patches, three main challenges in automatic SAR image interpretation are addressed: highly imbalanced classes, geographic diversity, and label noise. In this letter, a deep transfer learning method is proposed based on a similarly annotated optical land cover dataset (NWPU-RESISC45). In addition, a top-2 smooth loss function with cost-sensitive parameters is introduced to tackle the label noise and class imbalance problems. The proposed method transfers information efficiently from a similarly annotated remote sensing dataset, performs robustly on highly imbalanced classes, and alleviates the over-fitting caused by label noise. Moreover, the learned deep model generalizes well to other SAR-specific tasks, such as MSTAR target recognition, where it reaches a state-of-the-art classification accuracy of 99.46%.
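
    The top-2 smooth loss is the letter's key modification to standard cross-entropy. As a rough illustration of the idea only, the PyTorch sketch below penalizes a sample while its true class is outside the top two predictions, smooths that hinge with a temperature, and applies per-class cost-sensitive weights; the letter's exact formulation may differ, and tau and class_weights here are illustrative assumptions.

        import torch
        import torch.nn.functional as F

        def top2_smooth_loss(logits, targets, class_weights, tau=1.0):
            # Logit of the ground-truth class for each sample.
            true_logit = logits.gather(1, targets.unsqueeze(1)).squeeze(1)
            # Exclude the true class, then find the two strongest competitors.
            masked = logits.scatter(1, targets.unsqueeze(1), float("-inf"))
            top2_vals, _ = masked.topk(2, dim=1)
            # The true class is in the top 2 iff it beats the second-strongest
            # competitor; softplus gives a smooth version of that hinge.
            margin = top2_vals[:, 1] - true_logit
            loss = tau * F.softplus(margin / tau)
            # Cost-sensitive weighting: rare classes get larger weights.
            return (class_weights[targets] * loss).mean()

        # Toy usage: 5 samples, 4 classes, class 3 up-weighted as a rare class.
        logits = torch.randn(5, 4)
        targets = torch.tensor([0, 1, 2, 3, 3])
        weights = torch.tensor([1.0, 1.0, 1.0, 4.0])
        print(top2_smooth_loss(logits, targets, weights))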

    A Novel Deep Learning Framework Based on Transfer Learning and Joint Time-Frequency Analysis

    We propose Deep SAR-Net (DSN), a novel SAR-specific deep learning framework for complex-valued SAR images based on transfer learning and joint time-frequency analysis. Conventional deep convolutional neural networks usually take the amplitude information of single-polarization SAR images as input and learn hierarchical spatial features automatically, which may make it difficult to discriminate objects with similar texture but distinct scattering patterns. We therefore analyze complex-valued SAR images to learn both the spatial texture information and the backscattering patterns of objects on the ground. First, we experimented on a large-scale SAR land cover dataset collected from TerraSAR-X images, with a hierarchical three-level annotation of 150 categories and comprising more than 100,000 image patches. To address the three main challenges in automatically interpreting this dataset, namely highly imbalanced classes, geographic diversity, and label noise, a deep transfer learning method based on a similarly annotated optical land cover dataset (NWPU-RESISC45) was used to learn a deep residual convolutional neural network, optimizing a combined top-2 smooth loss function with cost-sensitive parameters. Rather than applying the ImageNet pre-trained ResNet-18 model to SAR images directly, the optical remote sensing land cover dataset narrows the gap between SAR and natural images, which significantly improves feature transferability, and the proposed combined loss function accelerates the training process and reduces the model's bias toward noisy labels. The trained deep residual CNN model generalizes well to other SAR image processing tasks, including MSTAR target recognition and land cover and land use localization. Based on this pre-trained model, we transferred the first two residual blocks to extract mid-level representative spatial features from the intensity images of single-look complex (SLC) SAR data, which have similar resolution and pixel spacing along the range and azimuth directions, avoiding large distortions. Then, a joint time-frequency analysis was applied to the SLC data to obtain a 4-D representation with information in all sub-bands, where the radar spectrograms reveal the backscattering diversity of objects on the ground versus range and azimuth frequencies. A stacked convolutional auto-encoder was designed to learn latent features, related to physical target properties, from the radar spectrograms in the frequency domain. The frequency features were then spatially aligned with the spatial information in the 4-D representation and fused with the transferred spatial features. A post-learning sub-net consisting of two bottleneck residual blocks was designed to make the final decisions. To the best of our knowledge, this is the first work to make full use of single-polarization SLC SAR data in deep learning. Compared with conventional CNNs based on intensity information only, the proposed DSN shows superior performance in SAR image land cover and land use classification, especially for man-made objects. In some cases, intensity images with similar shapes and textures lead CNNs to wrong decisions, while the spectrogram amplitudes present prominently different characteristics, helping the DSN reach a better understanding of the objects on the ground.
    For natural surfaces, on the other hand, the radar spectrograms present similar backscattering patterns without a specific mechanism that distinguishes features in the frequency domain, so they cannot provide enough extra information to support the interpretation of SAR images. The experiments are conducted on Sentinel-1 Stripmap SAR images, and we believe the proposed DSN can also be applied to TerraSAR-X SLC data.
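
    As a minimal sketch of the joint time-frequency step, the NumPy code below slides a tapered window over a complex SLC patch and takes a 2-D FFT at each position, yielding a 4-D array whose amplitudes show how backscattering varies with range and azimuth frequency. The window size, overlap, and Hanning taper are illustrative choices, not the paper's settings.

        import numpy as np

        def radar_spectrogram(slc, win=16, step=8):
            h, w = slc.shape
            taper = np.outer(np.hanning(win), np.hanning(win))
            ys = range(0, h - win + 1, step)
            xs = range(0, w - win + 1, step)
            spec = np.empty((len(ys), len(xs), win, win), dtype=np.float32)
            for i, y in enumerate(ys):
                for j, x in enumerate(xs):
                    patch = slc[y:y + win, x:x + win] * taper
                    # Sub-band amplitudes over the azimuth x range frequency plane.
                    spec[i, j] = np.abs(np.fft.fftshift(np.fft.fft2(patch)))
            return spec

        # Toy usage with random complex pixels standing in for Sentinel-1 SLC data.
        slc = np.random.randn(128, 128) + 1j * np.random.randn(128, 128)
        print(radar_spectrogram(slc).shape)   # (15, 15, 16, 16)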

    Transfer Learning with Deep Convolutional Neural Network for SAR Target Classification with Limited Labeled Data

    Tremendous progress has been made in object recognition with deep convolutional neural networks (CNNs), thanks to the availability of large-scale annotated datasets. With their ability to learn highly hierarchical image feature extractors, deep CNNs are also expected to solve Synthetic Aperture Radar (SAR) target classification problems. However, the limited amount of labeled SAR target data is a handicap for training a deep CNN. To solve this problem, we propose a transfer learning based method that makes knowledge learned from abundant unlabeled SAR scene images transferable to labeled SAR target data. We design an assembled CNN architecture consisting of a classification pathway and a reconstruction pathway, together with a feedback bypass. Instead of training a deep network from scratch on a limited dataset, a large number of unlabeled SAR scene images are first used to train the reconstruction pathway with stacked convolutional auto-encoders (SCAE). These pre-trained convolutional layers are then reused to transfer knowledge to the SAR target classification task, while the feedback bypass simultaneously introduces the reconstruction loss. The experimental results demonstrate that transfer learning leads to better performance when labeled training data are scarce, and that the additional feedback bypass with reconstruction loss helps boost the capability of the classification pathway.
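
    A minimal sketch of the two-stage scheme, assuming a toy architecture: a convolutional auto-encoder is first trained on unlabeled scene patches with a reconstruction loss, and its encoder is then reused inside the classifier, where the feedback bypass keeps the reconstruction loss in the joint objective.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class ConvAutoEncoder(nn.Module):
            def __init__(self, in_ch=1, mid_ch=32):
                super().__init__()
                self.encoder = nn.Sequential(
                    nn.Conv2d(in_ch, mid_ch, 3, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(mid_ch, 64, 3, stride=2, padding=1), nn.ReLU(),
                )
                self.decoder = nn.Sequential(
                    nn.ConvTranspose2d(64, mid_ch, 4, stride=2, padding=1), nn.ReLU(),
                    nn.ConvTranspose2d(mid_ch, in_ch, 4, stride=2, padding=1),
                )

            def forward(self, x):
                return self.decoder(self.encoder(x))

        # Stage 1: pre-train on unlabeled SAR scene patches (reconstruction only).
        ae = ConvAutoEncoder()
        x = torch.randn(8, 1, 64, 64)              # stand-in for scene patches
        recon_loss = F.mse_loss(ae(x), x)

        # Stage 2: transfer the encoder to the classification pathway; the
        # feedback bypass adds the reconstruction loss to the classification loss.
        classifier = nn.Sequential(ae.encoder, nn.Flatten(), nn.Linear(64 * 16 * 16, 10))
        labels = torch.randint(0, 10, (8,))
        total_loss = F.cross_entropy(classifier(x), labels) + 0.1 * recon_loss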

    Few-Shot PolSAR Ship Detection Based on Polarimetric Features Selection and Improved Contrastive Self-Supervised Learning

    Deep learning methods have been widely studied in the field of polarimetric synthetic aperture radar (PolSAR) ship detection over the past few years. However, the backscattering of man-made targets, including ships, is sensitive to the relative geometry between the target orientation and the radar line of sight, which gives rise to strong diversity in the polarimetric and spatial features of ships. This scattering diversity in turn aggravates the scarcity of labeled PolSAR samples, which are difficult to obtain. To address this issue and better extract the polarimetric and spatial features of PolSAR images, this paper proposes a few-shot PolSAR ship detection method that combines the selection of constructed polarimetric input data with improved contrastive self-supervised learning (CSSL) pre-training. Specifically, eight polarimetric feature extraction methods are adopted to construct deep learning network input data with polarimetric features. The backbone is pre-trained on unlabeled PolSAR input data through an improved CSSL method without negative samples, which enhances the representation capability via a multi-scale feature fusion module (MFFM) and implements a regularization strategy via a mix-up auxiliary pathway (MUAP). The pre-trained backbone is applied to the downstream ship detection network; only a few labeled samples are used for fine-tuning, and the construction method of polarimetric input data with the best detection performance is identified. Comparison and ablation experiments on a self-established PolSAR ship detection dataset verify the superiority of the proposed method, especially in the few-shot setting.
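
    The negative-sample-free CSSL objective resembles BYOL-style losses; under that assumption, the sketch below maximizes the cosine similarity between an online prediction of one augmented view and a stop-gradient target projection of another, with the MFFM and MUAP components omitted.

        import torch
        import torch.nn.functional as F

        def negative_free_loss(online_pred, target_proj):
            # Normalize both embeddings to the unit sphere.
            p = F.normalize(online_pred, dim=1)
            # Stop-gradient on the target branch: the asymmetry between the two
            # branches is what prevents collapse without negative samples.
            z = F.normalize(target_proj.detach(), dim=1)
            # 2 - 2*cos(p, z): minimized when the two views agree.
            return 2 - 2 * (p * z).sum(dim=1).mean()

        # Toy usage: embeddings of two augmented views of the same PolSAR patches.
        view_a = torch.randn(16, 128)
        view_b = torch.randn(16, 128)
        print(negative_free_loss(view_a, view_b))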

    Ship Detection in Gaofen-3 SAR Images Based on Sea Clutter Distribution Analysis and Deep Convolutional Neural Network

    Target detection is one of the important applications in the field of remote sensing. The Gaofen-3 (GF-3) Synthetic Aperture Radar (SAR) satellite launched by China is a powerful tool for maritime monitoring. This work aims at detecting ships in GF-3 SAR images using a new land masking strategy, an appropriate sea clutter model, and a neural network as the discrimination scheme. First, a fully convolutional network (FCN) is applied to separate the sea from the land. Then, by analyzing the sea clutter distribution in GF-3 SAR images, we choose the probability distribution model for the Constant False Alarm Rate (CFAR) detector from among the K-distribution, the Gamma distribution, and the Rayleigh distribution, based on a trade-off between sea clutter modeling accuracy and computational complexity. Furthermore, to better implement CFAR detection, we use truncated statistics (TS) as a preprocessing scheme and an iterative censoring scheme (ICS) to boost the detector's performance. Finally, we employ a neural network to re-examine the candidate detections in the discrimination stage. Experimental results on three GF-3 SAR images verify the effectiveness and efficiency of this approach.
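
    As a simplified sketch of the detection stage, the code below implements plain cell-averaging CFAR under a single-look exponential clutter model; the paper instead selects among the K, Gamma, and Rayleigh models and adds truncated statistics and iterative censoring, so the threshold factor here is only an illustrative stand-in.

        import numpy as np
        from scipy.ndimage import uniform_filter

        def ca_cfar(intensity, guard=4, train=12, pfa=1e-6):
            big = 2 * (guard + train) + 1
            small = 2 * guard + 1
            # Training-ring mean = (big-window sum - guard-window sum) / count.
            sum_big = uniform_filter(intensity, big) * big ** 2
            sum_small = uniform_filter(intensity, small) * small ** 2
            clutter = (sum_big - sum_small) / (big ** 2 - small ** 2)
            # Exponential clutter: P_fa = exp(-T / mu)  =>  T = -mu * ln(P_fa).
            threshold = -np.log(pfa) * clutter
            return intensity > threshold

        # Toy usage: exponential sea clutter with one bright synthetic target.
        sea = np.random.exponential(1.0, (256, 256))
        sea[128, 128] = 60.0
        print(ca_cfar(sea).sum())   # ideally 1: just the injected target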

    Cloudformer: A Cloud-Removal Network Combining Self-Attention Mechanism and Convolution

    Optical remote-sensing images have a wide range of applications, but they are often obscured by clouds, which hampers subsequent analysis. Cloud removal is therefore a necessary preprocessing step. In this paper, a novel transformer-based network named Cloudformer is proposed. The method combines the advantages of convolution and self-attention: convolution layers extract simple, short-range features in the shallow layers, while the self-attention mechanism captures long-range correlations in the deep layers. The method also introduces Locally-enhanced Positional Encoding (LePE), which flexibly generates suitable positional encodings for different inputs and exploits local information to enhance the encoding capability. Exhaustive experiments on public datasets demonstrate the method's ability to remove both thin and thick clouds, and ablation studies validate the effectiveness of the proposed modules.
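
    LePE originates in CSWin-style transformers; assuming that formulation carries over, the sketch below adds a depthwise convolution over the value map as a position-dependent term on top of standard multi-head self-attention. All dimensions are illustrative.

        import torch
        import torch.nn as nn

        class LePEAttention(nn.Module):
            def __init__(self, dim=64, heads=4):
                super().__init__()
                self.heads = heads
                self.qkv = nn.Linear(dim, dim * 3)
                # Depthwise conv supplies the locally-enhanced positional term.
                self.lepe = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)

            def forward(self, x, h, w):           # x: (B, h*w, dim)
                b, n, d = x.shape
                q, k, v = self.qkv(x).chunk(3, dim=-1)
                # Positional term: conv over the value map in its 2-D layout.
                v_map = v.transpose(1, 2).reshape(b, d, h, w)
                pos = self.lepe(v_map).reshape(b, d, n).transpose(1, 2)
                dh = d // self.heads
                q = q.reshape(b, n, self.heads, dh).transpose(1, 2)
                k = k.reshape(b, n, self.heads, dh).transpose(1, 2)
                v = v.reshape(b, n, self.heads, dh).transpose(1, 2)
                attn = (q @ k.transpose(-2, -1)) * dh ** -0.5
                out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(b, n, d)
                return out + pos

        # Toy usage: an 8x8 feature map with 64 channels, flattened to tokens.
        layer = LePEAttention()
        tokens = torch.randn(2, 8 * 8, 64)
        print(layer(tokens, 8, 8).shape)          # torch.Size([2, 64, 64])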

    Infrared Dim and Small Target Detection from Complex Scenes via Multi-Frame Spatial–Temporal Patch-Tensor Model

    Infrared imaging plays an important role in space-based early warning and anti-missile guidance due to its particular imaging mechanism. However, the signal-to-noise ratio of infrared images is usually low and the target is moving, which makes most existing methods perform poorly, especially in very complex scenes. To address these difficulties, this paper proposes a novel multi-frame spatial–temporal patch-tensor (MFSTPT) model for infrared dim and small target detection in complex scenes. First, simultaneous sampling in the spatial and temporal domains is adopted to make full use of the information across multi-frame images, establishing an image-patch tensor model under which the complex background better satisfies the low-rank assumption. Second, we propose using the Laplace function to approximate the tensor rank more accurately. Third, to suppress strong interference and sparse noise, a prior weighted saliency map is established through a weighted local structure tensor, assigning different weights to the target and the background. Solving the model with the alternating direction method of multipliers (ADMM), we can accurately separate the background and target components and obtain the detection results. Qualitative and quantitative experimental results on multiple real sequences verify the rationality and effectiveness of the proposed algorithm.
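
    A minimal sketch of the patch-tensor construction step: sliding patches are vectorized and stacked across positions and frames into a 3-D tensor, whose low-rank component is then meant to model the background and whose sparse component holds the dim targets. The Laplace rank surrogate and the ADMM solver are omitted, and the patch size and step are illustrative.

        import numpy as np

        def build_patch_tensor(frames, patch=30, step=10):
            t, h, w = frames.shape
            cols = []
            for f in range(t):
                for y in range(0, h - patch + 1, step):
                    for x in range(0, w - patch + 1, step):
                        cols.append(frames[f, y:y + patch, x:x + patch].ravel())
            n = len(cols) // t   # patches per frame
            # Axes: (vectorized patch, patch position, frame index).
            return np.stack(cols, axis=1).reshape(patch * patch, t, n).transpose(0, 2, 1)

        # Toy usage: five consecutive infrared frames.
        frames = np.random.rand(5, 100, 100)
        print(build_patch_tensor(frames).shape)   # (900, 64, 5)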

    Automatic Color Correction for Multisource Remote Sensing Images with Wasserstein CNN

    In this paper, a non-parametric color correction model based on a Wasserstein CNN is proposed. It is suitable for large-scale preprocessing of remote sensing images from multiple sources under various viewing conditions, including illumination variations, atmospheric disturbances, and differing sensor and aspect angles. Color correction aims to transfer the color palette of an input image to that of a standard reference which does not suffer from these disturbances. Most current methods depend heavily on the similarity between the inputs and the references, with respect to both content and conditions such as illumination and atmospheric state, and segmentation is usually necessary to alleviate color leakage along edges. Unlike previous studies, the proposed method matches the color distribution of the input dataset to the references within a probabilistic optimal transport framework. Multi-scale features are extracted from the intermediate layers of a lightweight CNN model and are used to infer the undisturbed distribution. The Wasserstein distance serves as the cost function measuring the discrepancy between the two color distributions. The advantage of the method is that no registration or segmentation is needed, benefiting from the local texture processing capability of CNN models. Experimental results demonstrate that the proposed method is effective when the input and reference images come from different sources, have different resolutions, and are acquired under different illumination and atmospheric conditions.
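
    As a small illustration of the transport-based training signal, the sketch below computes the 1-D Wasserstein-1 distance between the per-channel color distributions of two images by sorting pixel samples (the optimal 1-D coupling is monotone). The paper couples such a distance with multi-scale CNN features rather than using it standalone.

        import torch

        def channelwise_wasserstein(img_a, img_b):
            # img_*: (C, H, W); flatten each channel into a pixel sample set.
            a = img_a.flatten(1).sort(dim=1).values
            b = img_b.flatten(1).sort(dim=1).values
            # For equal-size 1-D samples, W1 is the mean absolute difference
            # of the sorted values.
            return (a - b).abs().mean()

        # Toy usage: a reference image versus a darker "disturbed" one.
        ref = torch.rand(3, 64, 64)
        dark = torch.rand(3, 64, 64) * 0.5
        print(channelwise_wasserstein(ref, dark))   # about 0.25 here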